Assessing the reproducibility of discriminant function analyses
Abstract
Data are the foundation of empirical research, yet all too often the datasets underlying published papers are unavailable, incorrect, or poorly curated. This is a serious issue, because future researchers are then unable to validate published results or reuse data to explore new ideas and hypotheses. Even if data files are securely stored and accessible, they must also be accompanied by accurate labels and identifiers. To assess how often problems with metadata or data curation affect the reproducibility of published results, we attempted to reproduce Discriminant Function Analyses (DFAs) from the field of organismal biology. DFA is a commonly used statistical analysis that has changed little since its inception almost eight decades ago, and therefore provides an opportunity to test reproducibility among datasets of varying ages. Of the 100 papers we initially surveyed, 14 were excluded because they did not present the common types of quantitative results from their DFA or gave insufficient details of their DFA. Of the remaining 86 datasets, there were 15 cases for which we were unable to confidently relate the dataset we received to the one used in the published analysis. The reasons included incomprehensible or absent variable labels, the DFA being performed on an unspecified subset of the data, and the dataset we received being incomplete. We focused on reproducing three common summary statistics from DFAs: the percent variance explained, the percentage correctly assigned, and the largest discriminant function coefficient. The reproducibility of the first two was fairly high (20 of 26 and 44 of 60 datasets, respectively), whereas our success rate with the discriminant function coefficients was lower (15 of 26 datasets). When considering all three summary statistics, we were able to completely reproduce 46 (65%) of 71 datasets.
While our results show that a majority of studies are reproducible, they highlight the fact that many studies still are not the carefully curated research that the scientific community and public expect.
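The counts reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names and the `rate` helper are assumptions introduced here, and every number is copied from the abstract rather than recomputed from the underlying datasets.

```python
# Counts taken directly from the abstract; names are illustrative only.
surveyed = 100
excluded_no_usable_dfa = 14   # no common quantitative result or too little detail
unmatched_datasets = 15       # dataset could not be related to the published analysis

with_usable_dfa = surveyed - excluded_no_usable_dfa
analysable = with_usable_dfa - unmatched_datasets

# (successfully reproduced, attempted) for each summary statistic.
reproduced = {
    "percent variance explained": (20, 26),
    "percentage correctly assigned": (44, 60),
    "largest discriminant function coefficient": (15, 26),
    "all three statistics": (46, analysable),
}


def rate(successes, attempts):
    """Return the success rate as a percentage, rounded to one decimal."""
    return round(100 * successes / attempts, 1)


if __name__ == "__main__":
    print(f"datasets with a usable DFA: {with_usable_dfa}")
    print(f"datasets that could be analysed: {analysable}")
    for name, (ok, n) in reproduced.items():
        print(f"{name}: {ok}/{n} = {rate(ok, n)}%")
```

Under these figures the funnel is consistent: 100 surveyed papers leave 86 with a usable DFA and 71 analysable datasets, and 46 of 71 rounds to the 65% quoted in the abstract.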
Similar resources
Pinpointing the classifiers of English language writing ability: A discriminant function analysis approach
The major aim of this paper was to investigate the validity of language and intelligence factors for classifying Iranian English learners' writing performance. Iranian participants of the study took three tests for grammar, breadth, and depth of vocabulary, and two tests for verbal and narrative intelligence. They also produced a corpus of argumentative writ...
Possibility and Reproducibility of Renal Assessing and Size Measurement by Three-Dimensional vs Two-Dimensional Ultrasonography in Dogs
Objective- To determine the possibility and reproducibility of three-dimensional ultrasonography (3DUS) and comparison of the achieved measurements to normal two-dimensional ultrasonography (2DUS). Design- Descriptive study. Animals- 10 young mixed normal dogs, age 1.5-2.5 years, weighing 9.7-12 kg. Procedure- Renal width, length, and depth measured in coronal and transverse sections. The measurement...
Canonical Analysis for Assessment of Genetic Diversity of Three Indigenous Chicken Ecotypes in North Gondar Zone, Ethiopia
A rapid exploratory field survey to identify indigenous chicken ecotypes was conducted in the North Gondar zone of Ethiopia. Chicken ecotypes including Necked neck, Gasgie and Gugut from Quara, Alefa and Tache Armacheho districts were identified, respectively. Morphological variations among the three study populations and nine measurable traits were evaluated. General linear model, canonical discrim...
Financial crisis and exchange market pressure in energy exporting countries: Fisher's discriminant function approach
Financial crises are unpredictable and threaten the economic stability of countries. Hence, policymakers are forced to adopt appropriate tactics to defuse and resolve crises. One of the indicators that helps policymakers and economists is the exchange market pressure. The purpose of this study is to examine the factors affecting the foreign exchange market pressure during 2008-2009 financia...
Discriminant Analysis for ARMA Models Based on Divergency Criterion: A Frequency Domain Approach
The extension of classical analysis to time series data is the basic problem faced in many fields, such as engineering, economics and medicine. The main objective of discriminant time series analysis is to examine how far it is possible to distinguish between various groups. There are two situations to be considered in the linear time series models. Firstly when the main discriminatory informati...
Volume 3
Publication date: 2015